Smart Paper Notes

IT/IST/IPLeiria Response to the Call for Proposals on JPEG Pleno Point Cloud Coding

André F. R. Guarda , Nuno M. M. Rodrigues , Manuel Ruivo , Luís Coelho , Abdelrahman Seleem , Fernando Pereira

Category: Computer Vision

2022-08-04

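This document describes a deep learning-based point cloud (PC) geometry codec and a deep learning-based PC joint geometry and colour codec, submitted in response to the Call for Proposals on JPEG Pleno Point Cloud Coding issued in January 2022. The proposed codecs build on recent developments in deep learning-based PC geometry coding and offer some of the key functionalities targeted by the Call for Proposals. On the JPEG Call for Proposals test set, the proposed geometry codec offers a compression efficiency that outperforms the MPEG G-PCC standard and outperforms, or is competitive with, the MPEG V-PCC Intra standard. The same does not hold for the joint geometry and colour codec, however, because of a quality saturation effect that still needs to be overcome.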


Optimal Transport for Unsupervised Hallucination Detection in Neural Machine Translation

Nuno M. Guerreiro , Pierre Colombo , Pablo Piantanida , André F. T. Martins

Category: Natural Language Processing | Machine Learning

2022-12-19

Neural machine translation (NMT) has become the de-facto standard in real-world machine translation applications. However, NMT models can unpredictably produce severely pathological translations, known as hallucinations, that seriously undermine user trust. It becomes thus crucial to implement effective preventive strategies to guarantee their proper functioning. In this paper, we address the problem of hallucination detection in NMT by following a simple intuition: as hallucinations are detached from the source content, they exhibit encoder-decoder attention patterns that are statistically different from those of good quality translations. We frame this problem with an optimal transport formulation and propose a fully unsupervised, plug-in detector that can be used with any attention-based NMT model. Experimental results show that our detector not only outperforms all previous model-based detectors, but is also competitive with detectors that employ large models trained on millions of samples.
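The intuition above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the detector compares the attention mass placed on each source token with a uniform reference distribution, and it uses a 0/1 transport cost, under which the optimal transport distance reduces to the total variation distance. The function names and the `threshold` value are hypothetical.

```python
def source_attention_mass(attention):
    """Average the attention rows (one row per target token) into a single
    distribution over source tokens."""
    n_target = len(attention)
    n_source = len(attention[0])
    return [sum(row[j] for row in attention) / n_target for j in range(n_source)]


def ot_distance_to_uniform(mass):
    """Optimal transport cost between `mass` and the uniform distribution.

    Assumption: a 0/1 cost (moving any unit of mass costs 1), for which the
    optimal transport cost equals the total variation distance.
    """
    uniform = 1.0 / len(mass)
    return 0.5 * sum(abs(m - uniform) for m in mass)


def flag_hallucination(attention, threshold=0.5):
    """Flag a translation whose source attention mass is far from uniform.
    The threshold is a hypothetical value, to be tuned on held-out data."""
    return ot_distance_to_uniform(source_attention_mass(attention)) > threshold
```

An attention map that spreads its mass evenly over the source scores 0, while one that collapses onto a single source token scores close to 1, which is the kind of pattern the abstract associates with translations detached from the source.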


Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann , Annika Reinke , Vivienn Weru , Minu Dietlinde Tizabi , Fabian Isensee , Tim J. Adler , Patrick Godau , Veronika Cheplygina , Michal Kozubek , Sharib Ali

Category: Computer Vision | Machine Learning

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
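For readers unfamiliar with the K-fold cross-validation mentioned in the survey results, the following is a minimal, library-free sketch of how the training set can be split. The function name `k_fold_indices` and the choice of contiguous, unshuffled folds are illustrative assumptions, not something the survey prescribes.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds.

    Folds are contiguous and their sizes differ by at most one sample.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample.
        size = base + (1 if fold < extra else 0)
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, validation
        start += size
```

Each sample appears in exactly one validation fold, so every model is evaluated on data it was not trained on. In practice the indices are usually shuffled first, and for medical images the split is often made per patient rather than per image.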


CometKiwi: IST-Unbabel 2022 Submission for the Quality Estimation Shared Task

Ricardo Rei , Marcos Treviso , Nuno M. Guerreiro , Chrysoula Zerva , Ana C. Farinha , Christine Maroti , José G. C. de Souza , Taisiya Glushkova , Duarte M. Alves , Alon Lavie

Category: Natural Language Processing | Machine Learning

2022-09-13

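We present the joint contribution of IST and Unbabel to the WMT 2022 Shared Task on Quality Estimation (QE). Our team participated in all three subtasks: (i) sentence- and word-level quality prediction; (ii) explainable QE; and (iii) critical error detection. For all tasks, we build on top of the COMET framework, connecting it with the predictor-estimator architecture of OpenKiwi and equipping it with a word-level sequence tagger and an explanation extractor. Our results suggest that incorporating references during pretraining improves performance on downstream tasks across several language pairs, and that jointly training with sentence- and word-level objectives yields a further boost. Furthermore, combining attention and gradient information proved to be the top strategy for extracting good explanations from sentence-level QE models. Overall, our submissions achieved the best results on all three tasks for almost all language pairs.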


A Robust Scientific Machine Learning for Optimization: A Novel Robustness Theorem

Luana P. Queiroz , Carine M. Rebello , Erbet A. Costa , Vinicius V. Santana , Alirio E. Rodrigues , Ana M. Ribeiro , Idelfonso B. R. Nogueira

Category: Machine Learning

2022-09-13

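Scientific machine learning (SciML) is a field of growing interest across several application areas. In an optimization context, SciML-based tools have enabled the development of more efficient optimization methods. However, SciML tools used for optimization must be evaluated and applied with care. This work derives a robustness test that guarantees the robustness of multiobjective SciML-based optimization by showing that its results respect the universal approximation theorem. The test is applied within the framework of a novel methodology, which is evaluated on a series of benchmarks to illustrate its consistency. In addition, the results of the proposed methodology are compared with the feasible region of the optimization, which requires greater computational effort to obtain. This work therefore provides a robustness test for the guaranteed application of SciML tools in multiobjective optimization at a lower computational effort than the existing alternative.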


A new Reinforcement Learning framework to discover natural flavor molecules

Luana P. Queiroz , Carine M. Rebello , Erbet A. Costa , Vinícius V. Santana , Bruno C. L. Rodrigues , Alírio E. Rodrigues , Ana M. Ribeiro , Idelfonso B. R. Nogueira

Category: Machine Learning

2022-09-13

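Flavor is the focal point of the flavor industry, which follows social trends and behaviors. The research and development of new flavoring agents and molecules is essential in this field. At the same time, the development of natural flavors plays a critical role in modern society. In light of this, the present work proposes a novel framework based on scientific machine learning to address an emerging problem in flavor engineering and industry, and thereby brings an innovative methodology for designing new natural flavor molecules. The molecules were evaluated with regard to synthetic accessibility, number of atoms, and likeness to natural or pseudo-natural products.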


Looking for a Needle in a Haystack: A Comprehensive Study of Hallucinations in Neural Machine Translation

Nuno M. Guerreiro , Elena Voita , André F. T. Martins

Category: Natural Language Processing | Machine Learning

2022-08-10

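Although the problem of hallucinations in neural machine translation (NMT) has received some attention, research on this highly pathological phenomenon lacks solid ground. Previous work has been limited in several ways: it often resorts to artificial settings where the problem is amplified, it disregards some (common) types of hallucinations, and it does not validate the adequacy of its detection heuristics. In this paper, we lay the foundations for the study of NMT hallucinations. First, we work in a natural setting, i.e., in-domain data without artificial noise in either training or inference. Next, we annotate a dataset of over 3.4k sentences, indicating different kinds of critical errors and hallucinations. Then, we turn to detection methods: we revisit previously used methods and propose using glass-box uncertainty-based detectors. Overall, we show that for preventive settings, (i) previously used methods are largely inadequate, and (ii) sequence log-probability works best and performs on par with reference-based methods. Finally, we propose DeHallucinator, a simple method for alleviating hallucinations at test time that significantly reduces the hallucination rate. To ease future research, we release our annotated dataset for WMT18 German-English data, along with the model, training data, and code.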
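A sequence log-probability detector of the kind referred to above can be sketched in a few lines. This is only an illustration under stated assumptions: the score is taken to be the length-normalised sum of the token log-probabilities assigned by the model to its own output, and both the function names and the `threshold` value are hypothetical rather than taken from the paper.

```python
import math


def seq_logprob(token_probs):
    """Length-normalised log-probability of a generated sequence.

    `token_probs` holds the probability the model assigned to each token
    it generated; all values must be in (0, 1].
    """
    return sum(math.log(p) for p in token_probs) / len(token_probs)


def is_suspect(token_probs, threshold=-1.0):
    """Flag a translation whose score falls below a (hypothetical) threshold,
    i.e. one the model itself was not confident about."""
    return seq_logprob(token_probs) < threshold
```

Because the score uses only the model's own output probabilities, it needs no reference translation and no additional model, which is what makes it usable in a preventive setting.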


Severity classification in cases of Collagen VI-related myopathy with Convolutional Neural Networks and handcrafted texture features

Rafael Rodrigues , Susana Quijano-Roy , Robert-Yves Carlier , Antonio M. G. Pinheiro

Category: Computer Vision

2022-02-28

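Magnetic Resonance Imaging (MRI) is a non-invasive tool for the clinical assessment of low-prevalence neuromuscular disorders. Automated diagnosis methods might reduce the need for biopsies and provide valuable information for disease follow-up. In this paper, three methods are proposed to classify target muscles in Collagen VI-related myopathy cases according to their degree of involvement: a Convolutional Neural Network, a Fully Connected Network that classifies texture features, and a hybrid method combining the two feature sets. The methods were evaluated on axial T1-weighted Turbo Spin-Echo MRI from 26 subjects, including Ullrich Congenital Muscular Dystrophy and Bethlem Myopathy patients at different evolutionary stages. The best cross-validation results were obtained with the hybrid model, which reached a global accuracy of 93.8% and per-class scores of 0.99, 0.82, and 0.95 for healthy, mild, and moderate/severe cases, respectively.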


Semi-Supervised Graph Attention Networks for Event Representation Learning

Joao Pedro Rodrigues Mattos , Ricardo M. Marcacini

Category: Machine Learning

2022-01-02

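Event analysis from news and social networks is very useful for a wide range of social studies and real-world applications. Recently, event graphs have been explored to model event datasets and their complex relationships, where events are vertices connected to other vertices representing locations, people's names, dates, and various other event metadata. Graph representation learning methods are promising for extracting latent features from event graphs so that different classification algorithms can be applied. However, existing methods fail to meet essential requirements of event graphs, such as (i) dealing with semi-supervised graph embedding to take advantage of the few labeled events, (ii) automatically determining the importance of the relationships between event vertices and their metadata vertices, and (iii) dealing with graph heterogeneity. This paper presents GNEE (GAT Neural Event Embeddings), a method that combines Graph Attention Networks and Graph Regularization. First, an event graph regularization is proposed to ensure that all graph vertices receive event features, thereby mitigating the graph heterogeneity drawback. Second, semi-supervised graph embedding with a self-attention mechanism takes the existing labeled events into account and learns the importance of the relationships in the event graph during the representation learning process. A statistical analysis of experimental results with five real-world event graphs and six graph embedding methods shows that GNEE outperforms state-of-the-art semi-supervised graph embedding methods.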


Conservation Tools: The Next Generation of Engineering-Biology Collaborations

Andrew Schulz , Cassie Shriver , Suzanne Stathatos , Benjamin Seleb , Emily Weigel , Young-Hui Chang , M. Saad Bhamla , David Hu , Joseph R. Mendelson III

Category: Machine Learning

2023-01-03

The recent increase in public and academic interest in preserving biodiversity has led to the growth of the field of conservation technology. This field involves designing and constructing tools that utilize technology to aid in the conservation of wildlife. In this article, we will use case studies to demonstrate the importance of designing conservation tools with human-wildlife interaction in mind and provide a framework for creating successful tools. These case studies include a range of complexities, from simple cat collars to machine learning and game theory methodologies. Our goal is to introduce and inform current and future researchers in the field of conservation technology and provide references for educating the next generation of conservation technologists. Conservation technology not only has the potential to benefit biodiversity but also has broader impacts on fields such as sustainability and environmental protection. By using innovative technologies to address conservation challenges, we can find more effective and efficient solutions to protect and preserve our planet's resources.
